File Paths and Managing Files

File Paths

Understanding File Paths

A file path is a string that identifies the location of a file or directory on a computer or web server. File paths are crucial in programming for accessing, modifying, and organizing files within applications.

Types of File Paths

Relative File Paths

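Locate a file relative to the current working directory, so a file in that directory can be opened by its name alone.
Resolved against the directory from which the Python script is run, which is not necessarily the directory that contains the script.
Preferred for their flexibility across different systems.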

Absolute File Paths

Specify the exact location of a file from the root of the file system: the drive letter (on Windows), the directories, and the file name.
Vary between operating systems:
- Windows: C:/my-directory/target-file.txt
- Mac/Linux: /users/username/my-directory/target-file.txt
Generally avoided due to lack of portability.
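
The difference between the two kinds of path can be checked directly. The following is a minimal sketch; the folder name `data` is a hypothetical choice for illustration, and nothing is read from or written to disk.

```python
import os

# Hypothetical relative path; the file does not need to exist for this check.
relative = os.path.join("data", "target-file.txt")

# A relative path is resolved against the current working directory.
resolved = os.path.abspath(relative)

print(os.path.isabs(relative))  # False
print(os.path.isabs(resolved))  # True
print(resolved == os.path.join(os.getcwd(), relative))  # True
```

Running the same script from a different directory changes `resolved`, which is why relative paths depend on where the script is started.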

Using File Paths in Python

Cross-Platform Compatibility: Use the os.path module to handle differences between operating systems.
Environment Variables: Paths can also be built from environment variables (read through os.environ), which keeps machine-specific locations out of the code.
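
As a rough sketch of both points, a path can be assembled from an environment variable and joined with `os.path.join`. The variable name `APP_DATA_DIR` is an assumption made up for this example; it is not defined by Python or by any library.

```python
import os

# APP_DATA_DIR is a hypothetical variable name; fall back to the
# current working directory when it is not set.
data_dir = os.environ.get("APP_DATA_DIR", os.getcwd())
file_path = os.path.join(data_dir, "target-file.txt")

# os.path.expanduser replaces a leading "~" with the user's home directory.
home_file = os.path.expanduser(os.path.join("~", "target-file.txt"))

print(file_path)
print(home_file)
```

The printed values differ from machine to machine, which is the point: the code stays the same while the locations come from the environment.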

How to Write File Paths in Code

File Paths and Operating Systems

Windows:
- Uses drive letters and backslashes: C:\my-directory\target-file.txt
- Backslashes are escape characters in Python string literals, so write each one as \\ or use a raw string such as r'C:\my-directory\target-file.txt'.
Mac/Linux:
- Use forward slashes and start from the root directory: /users/username/my-directory/target-file.txt
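
The escaping issue is easiest to see by writing the same Windows path in several ways. This is an illustrative sketch only; it compares strings and does not touch the file system.

```python
# Three ways to write the same Windows path in Python source code.
escaped = "C:\\my-directory\\target-file.txt"  # each backslash doubled
raw = r"C:\my-directory\target-file.txt"       # raw string: backslashes kept as written
forward = "C:/my-directory/target-file.txt"    # forward slashes, no escaping needed

print(escaped == raw)                     # True
print(forward.replace("/", "\\") == raw)  # True

# Without escaping, the "\t" in "\target-file.txt" would become a tab character.
print(len("\t"))  # 1
```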

Best Practices

Use Forward Slashes: Even on Windows, using forward slashes (/) avoids issues with escape characters.
- Example: C:/my-directory/target-file.txt
Avoid Absolute Paths: Use relative paths or dynamically construct paths for portability.

The `os` Module in Python

Accessing the Current Working Directory:

import os
current_directory = os.getcwd()

Constructing File Paths:

file_path = os.path.join(current_directory, 'target-file.txt')

Listing Files and Directories:

contents = os.listdir(current_directory)

Examples

Deleting a File

import os

# Delete a file
os.remove('obsolete-file.txt')

Renaming a File

import os

# Rename a file
os.rename('old-name.txt', 'new-name.txt')

Working with Files

File Operations with the `os` Module

Deleting Files: os.remove('filename')
Renaming Files: os.rename('old_name', 'new_name')
Moving Files: Use shutil.move('source', 'destination') from the shutil module.
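
Moving is the one operation in this list without an example elsewhere, so here is a minimal sketch. The file and folder names are hypothetical, and everything happens inside a temporary directory so that no real files are affected.

```python
import os
import shutil
import tempfile

# Work inside a temporary directory that is removed automatically.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "report.txt")    # hypothetical file name
    archive_dir = os.path.join(tmp, "archive")  # hypothetical folder name
    destination = os.path.join(archive_dir, "report.txt")

    with open(source, "w") as f:
        f.write("quarterly numbers")

    os.mkdir(archive_dir)
    shutil.move(source, destination)

    # The file now exists only at the destination.
    moved = os.path.exists(destination) and not os.path.exists(source)

print(moved)  # True
```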

Checking File Existence

Using os.path.exists():

import os

if os.path.exists('important-file.txt'):
    print('File exists.')
else:
    print('File does not exist.')
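
Note that `os.path.exists()` also returns True for directories. When it matters whether the path is a regular file, `os.path.isfile()` is the more specific check. A small sketch, using a temporary directory as an assumption-free test location:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "important-file.txt")
    with open(file_path, "w") as f:
        f.write("data")

    dir_exists = os.path.exists(tmp)          # exists() accepts directories too
    dir_is_file = os.path.isfile(tmp)         # a directory is not a file
    file_is_file = os.path.isfile(file_path)

print(dir_exists, dir_is_file, file_is_file)  # True False True
```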

More File Information

Getting File Metadata

File Size:

import os

size = os.path.getsize('example.txt')
print(f'File size: {size} bytes')

Last Modification Time:

import os
import datetime

timestamp = os.path.getmtime('example.txt')
modification_time = datetime.datetime.fromtimestamp(timestamp)
print(f'Last modified: {modification_time}')

Working with Timestamps

Unix Timestamps: Represent the number of seconds since 00:00:00 UTC on January 1, 1970.

Converting Timestamps:

import datetime

timestamp = 1609459200  # Example timestamp
readable_time = datetime.datetime.fromtimestamp(timestamp)
print(readable_time)  # 2021-01-01 00:00:00 if the local time zone is UTC; other time zones shift the result
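
Because `fromtimestamp()` without a time zone converts to local time, the result depends on the machine's settings. One way to get the same answer everywhere, sketched here with the same example timestamp, is to request UTC explicitly:

```python
from datetime import datetime, timezone

timestamp = 1609459200  # same example timestamp as above

# Passing tz=timezone.utc gives a time-zone-aware result that does not
# depend on the computer's local settings.
utc_time = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(utc_time)              # 2021-01-01 00:00:00+00:00
print(utc_time.isoformat())  # 2021-01-01T00:00:00+00:00

# Converting back gives the original number of seconds.
print(int(utc_time.timestamp()))  # 1609459200
```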

Absolute Paths

Getting Absolute Paths:

import os

absolute_path = os.path.abspath('relative/path/to/file.txt')
print(absolute_path)

Directories

Working with Directories

Getting the Current Working Directory

import os

current_directory = os.getcwd()
print(f'Current directory: {current_directory}')

Creating Directories

import os

# Create a new directory
os.mkdir('new_directory')
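
`os.mkdir()` creates a single directory and fails if the parent is missing or the directory already exists. For nested folders, `os.makedirs()` is a common alternative. The sketch below uses hypothetical folder names inside a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "reports", "2024", "q1")  # hypothetical folder names

    # os.mkdir(nested) would raise FileNotFoundError here,
    # because the parent folder "reports" does not exist yet.
    os.makedirs(nested, exist_ok=True)
    os.makedirs(nested, exist_ok=True)  # repeating the call is not an error

    created = os.path.isdir(nested)

print(created)  # True
```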

Changing Directories

import os

# Change to a different directory
os.chdir('new_directory')

Removing Directories

Remove Empty Directory:
```
import os

os.rmdir('obsolete_directory')
```

Remove Non-Empty Directory:

import shutil

shutil.rmtree('obsolete_directory')

Listing Directory Contents

import os

# List files and directories
contents = os.listdir('.')
for item in contents:
    print(item)

Differentiating Files and Directories:

import os

for item in os.listdir('.'):
    if os.path.isdir(item):
        print(f'{item}/')
    else:
        print(item)
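
The loop above works because it lists the current directory. `os.listdir()` returns bare names, so for any other directory each name has to be joined with the directory path before it is tested. A minimal sketch, with hypothetical names created in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "subfolder"))
    with open(os.path.join(tmp, "notes.txt"), "w") as f:
        f.write("hello")

    labels = []
    for item in sorted(os.listdir(tmp)):
        # os.listdir returns names only, so rebuild the full path first.
        full_path = os.path.join(tmp, item)
        labels.append(f"{item}/" if os.path.isdir(full_path) else item)

print(labels)  # ['notes.txt', 'subfolder/']
```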

Constructing File Paths

Using os.path.join():

import os

path = os.path.join('folder', 'subfolder', 'file.txt')
print(path)  # Outputs: folder/subfolder/file.txt on Mac/Linux, folder\subfolder\file.txt on Windows

Working with CSV Files Using Pandas

What is a CSV File?

Definition: A Comma Separated Values (CSV) file is a plain text file that uses commas to separate values.
Usage: Commonly used for importing and exporting data for spreadsheets and databases.

Structure:

Name,Department,Salary
Aisha Khan,Engineering,80000
Jules Lee,Marketing,67000
Queenie Corbit,Human Resources,90000
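
To make the structure concrete, the sample above can be parsed with Python's built-in csv module. This sketch reads the text from a string rather than a file, which is a simplification for illustration:

```python
import csv
import io

text = """Name,Department,Salary
Aisha Khan,Engineering,80000
Jules Lee,Marketing,67000
Queenie Corbit,Human Resources,90000
"""

# The first line supplies the column names; each later line becomes one row.
rows = list(csv.DictReader(io.StringIO(text)))

print(len(rows))          # 3
print(rows[0]["Name"])    # Aisha Khan
print(rows[2]["Salary"])  # 90000 (still a string; csv does no type conversion)
```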

Introduction to Pandas

Pandas: An open-source Python library providing high-performance data manipulation and analysis tools.
Advantages over the csv Module:
- Simplifies reading and writing data.
- Handles complex data operations.
- Provides DataFrame objects for easy data manipulation.

Reading CSV Files with Pandas

Importing Pandas

import pandas as pd

Reading a CSV File

# Read the CSV file into a DataFrame
df = pd.read_csv('employees.csv')

# Display the DataFrame
print(df)

Output:

             Name       Department  Salary
0      Aisha Khan      Engineering   80000
1       Jules Lee        Marketing   67000
2  Queenie Corbit  Human Resources   90000

Accessing Data

Accessing Columns:

# Get the 'Name' column
names = df['Name']

Iterating Over Rows:

for index, row in df.iterrows():
    print(f"{row['Name']} works in {row['Department']}")

Writing CSV Files with Pandas

Creating a DataFrame

import pandas as pd

# Define data as a dictionary
data = {
    'Name': ['Carlos Rodriguez', 'Li Wei', 'Fatima Zahra'],
    'Department': ['IT', 'Finance', 'Marketing'],
    'Salary': [75000, 82000, 73000]
}

# Create a DataFrame
df = pd.DataFrame(data)

Writing to a CSV File

# Write the DataFrame to a CSV file
df.to_csv('new_employees.csv', index=False)

Resulting new_employees.csv:

Name,Department,Salary
Carlos Rodriguez,IT,75000
Li Wei,Finance,82000
Fatima Zahra,Marketing,73000

Reading and Writing CSV Files with Specific Options

Handling Missing Data

# Read CSV while handling missing values
df = pd.read_csv('employees.csv', na_values=['Not Available', 'NA'])
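
The effect of `na_values` can be seen on a small in-memory sample. The data below is hypothetical and is read from a string instead of employees.csv so the sketch is self-contained:

```python
import io
import pandas as pd

# Hypothetical data: two salaries are recorded with placeholder text.
text = """Name,Department,Salary
Aisha Khan,Engineering,80000
Jules Lee,Marketing,Not Available
Queenie Corbit,Human Resources,NA
"""

df = pd.read_csv(io.StringIO(text), na_values=["Not Available", "NA"])

# Both placeholders are now treated as missing values (NaN).
missing = int(df["Salary"].isna().sum())
print(missing)               # 2
print(df["Salary"].mean())   # 80000.0 (missing values are skipped)
```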

Specifying Delimiters

# Read a CSV file with semicolon delimiter
df = pd.read_csv('employees.csv', delimiter=';')
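
A quick way to see why the delimiter matters is to read the same semicolon-separated text with and without the option. The sample text is an assumption for illustration:

```python
import io
import pandas as pd

text = "Name;Department;Salary\nAisha Khan;Engineering;80000\nJules Lee;Marketing;67000\n"

wrong = pd.read_csv(io.StringIO(text))                 # default comma delimiter
right = pd.read_csv(io.StringIO(text), delimiter=";")

print(wrong.shape)          # (2, 1): each line was read as a single column
print(right.shape)          # (2, 3)
print(list(right.columns))  # ['Name', 'Department', 'Salary']
```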

Writing without Index

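Exclude Index Column: By default, to_csv() writes the DataFrame's row index as an unnamed first column; pass index=False to leave it out.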
```
df.to_csv('employees.csv', index=False)
```

Data Manipulation with Pandas

Filtering Data

# Filter employees with salary greater than 80000
high_earners = df[df['Salary'] > 80000]
print(high_earners)
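
Filters can also combine several conditions. The sketch below reuses the sample employees from the writing example; the thresholds are arbitrary values chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Carlos Rodriguez", "Li Wei", "Fatima Zahra"],
    "Department": ["IT", "Finance", "Marketing"],
    "Salary": [75000, 82000, 73000],
})

# Each condition needs its own parentheses; use & for "and", | for "or".
selected = df[(df["Salary"] > 74000) & (df["Department"] != "Finance")]
print(selected["Name"].tolist())  # ['Carlos Rodriguez']
```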

Adding New Columns

# Add a new column for bonus
df['Bonus'] = df['Salary'] * 0.10

Modifying Data

# Increase salary by 5%
df['Salary'] = df['Salary'] * 1.05

Advantages of Using Pandas

Powerful Data Structures: DataFrames and Series.
Easy Data Cleaning: Handling missing data and duplicates.
Data Analysis Tools: Statistical functions and aggregation.
Integration with Other Libraries: Works well with NumPy and Matplotlib.
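
A short sketch of the cleaning and aggregation points above, using hypothetical data that contains one duplicated row and one missing salary:

```python
import pandas as pd

# Hypothetical data with one duplicated row and one missing salary.
df = pd.DataFrame({
    "Name": ["Aisha Khan", "Aisha Khan", "Jules Lee", "Li Wei"],
    "Department": ["Engineering", "Engineering", "Marketing", "Engineering"],
    "Salary": [80000, 80000, None, 82000],
})

# Remove exact duplicate rows, then rows without a salary.
cleaned = df.drop_duplicates().dropna(subset=["Salary"])

# Aggregate: average salary per department.
average_by_department = cleaned.groupby("Department")["Salary"].mean()

print(len(cleaned))                          # 2
print(average_by_department["Engineering"])  # 81000.0
```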

Practical Example: Processing CSV Data with Pandas

Scenario

You have a CSV file inventory.csv containing inventory data:

Item,Quantity,Price
Laptop,20,1500
Mouse,150,20
Keyboard,85,45
Monitor,40,300

Reading the CSV File

import pandas as pd

# Read the CSV file
inventory = pd.read_csv('inventory.csv')

Calculating Total Inventory Value

# Add a new column for total value per item
inventory['TotalValue'] = inventory['Quantity'] * inventory['Price']

# Calculate the total inventory value
total_inventory_value = inventory['TotalValue'].sum()
print(f'Total Inventory Value: ${total_inventory_value}')

Output:

Total Inventory Value: $48825

Saving the Updated Inventory to a New CSV File

# Save the updated inventory to a new CSV file
inventory.to_csv('updated_inventory.csv', index=False)
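
To confirm that the saved file contains the new column, it can be read back in. This sketch rebuilds the scenario's data in memory and writes to a temporary directory instead of the working directory, which is a choice made only to keep the example self-contained.

```python
import os
import tempfile
import pandas as pd

inventory = pd.DataFrame({
    "Item": ["Laptop", "Mouse", "Keyboard", "Monitor"],
    "Quantity": [20, 150, 85, 40],
    "Price": [1500, 20, 45, 300],
})
inventory["TotalValue"] = inventory["Quantity"] * inventory["Price"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "updated_inventory.csv")
    inventory.to_csv(path, index=False)
    reloaded = pd.read_csv(path)  # read the file back to check its contents

print(list(reloaded.columns))         # ['Item', 'Quantity', 'Price', 'TotalValue']
print(reloaded["TotalValue"].tolist())  # [30000, 3000, 3825, 12000]
```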

Conclusion

Using Pandas for CSV operations provides a robust and efficient way to handle data. It simplifies the process of reading, writing, and manipulating CSV files, making data analysis tasks more straightforward.

Resources for Further Learning:

Pandas Documentation: https://pandas.pydata.org/docs/
Working with CSV Files in Pandas: Real Python Tutorial
Data Analysis with Pandas: Official Pandas Tutorials
